Generalized bucketization scheme for flexible privacy settings

Authors

  • Ke Wang
  • Peng Wang
  • Ada Wai-Chee Fu
  • Raymond Chi-Wing Wong
Abstract

Bucketization is an anonymization technique for publishing sensitive data. The idea is to group records into small buckets to obscure the record-level association between sensitive information and identifying information. Compared to the traditional generalization technique, bucketization does not require a taxonomy of attribute values, so it is applicable to more data sets. A drawback of previous bucketization schemes is their uniform privacy setting and uniform bucket size, which often result in a non-achievable privacy goal or excessive information loss when sensitive values have variable sensitivity. In this work, we present a flexible bucketization scheme to address these issues. In the flexible scheme, each sensitive value can have its own privacy setting and buckets of different sizes can be formed. The challenge is to determine proper bucket sizes and group sensitive values into buckets so that the privacy setting of each sensitive value is satisfied and overall information loss is minimized. We define the bucket setting problem to formalize this requirement. We present two efficient solutions to this problem. The first solution is optimal under the assumption that two different bucket sizes are allowed, and the second solution is heuristic without this assumption. We experimentally evaluate the effectiveness of this generalized bucketization scheme. © 2016 Elsevier Inc. All rights reserved.
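To make the idea of per-value privacy settings concrete, here is a minimal Python sketch. It is not the paper's algorithm. It assumes, as one plausible reading the abstract does not confirm, that a privacy setting is an upper bound on the relative frequency of a sensitive value inside its bucket. The names `bucket_satisfies` and `max_freq`, and the example values, are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def bucket_satisfies(bucket, max_freq):
    """Check one bucket against per-value privacy settings.

    bucket   -- list of sensitive values of the records in the bucket
    max_freq -- ASSUMED form of a privacy setting: for each sensitive value,
                the largest fraction of the bucket it is allowed to occupy
    """
    counts = Counter(bucket)
    size = len(bucket)
    # Every value present in the bucket must stay within its own threshold.
    return all(Fraction(c, size) <= max_freq[v] for v, c in counts.items())


# Hypothetical settings: "HIV" is treated as more sensitive than the others.
max_freq = {
    "flu": Fraction(1, 2),
    "cold": Fraction(1, 2),
    "HIV": Fraction(1, 4),
}

small_bucket = ["flu", "HIV"]                  # "HIV" occupies 1/2 of the bucket
large_bucket = ["flu", "flu", "HIV", "cold"]   # "HIV" occupies 1/4 of the bucket

small_ok = bucket_satisfies(small_bucket, max_freq)  # False: 1/2 > 1/4
large_ok = bucket_satisfies(large_bucket, max_freq)  # True
```

Under this assumption, a uniform bucket size of 2 can never accommodate the stricter setting for "HIV", while forcing size 4 on every bucket would blur less sensitive values more than needed. This is the tension that variable bucket sizes are meant to relieve.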


Similar resources

Improving Privacy And Data Utility For High-Dimensional Data By Using Anonymization Technique

Privacy preserving is one of the significant methods in data mining for hiding sensitive information. Anonymization techniques like generalization and bucketization have been used for privacy preserving. The main problem with generalization is that it is not applicable to high-dimensional data, and the bucketization technique does not prevent membership disclosure. Slicing is one of the novel techniques ...


Efficient Techniques for Preserving Microdata Using Slicing

Privacy-preserving publishing covers techniques that apply privacy protection to large amounts of collected data. One recent problem lies in the field of data publication. The data often contain personally identifiable information, so releasing such data raises privacy problems. Several anonymization techniques such as generalization and bucketization have been designed for priva...


Segmenting: A New-Fangled Advance to Isolation Conserving Facts Distributing

Re-identification is a major privacy threat to public datasets containing individual records. Many privacy protection algorithms rely on generalization and suppression of "quasi-identifier" attributes such as ZIP code and birthdate. Several anonymization techniques, such as generalization and bucketization, have been designed for privacy-preserving microdata publishing. Recent work has shown th...


A Survey on Privacy Preservation in Data Publishing

Privacy preservation is a central issue in data publishing, as sensitive information should not be leaked. To this end, several techniques such as generalization and bucketization have been proposed. However, generalization fails on high-dimensional data because of dimensionality and it causes information loss due to uniform distrib...


A centralized privacy-preserving framework for online social networks

There are some critical privacy concerns in current online social networks (OSNs). Users' information is disclosed to entities that were not supposed to access it. Furthermore, the notion of friendship is inadequate in OSNs, since the degree of social relationships between users changes dynamically over time. Additionally, users may define similar privacy settings for their f...



Journal:
  • Inf. Sci.

Volume 348

Publication date: 2016